Fix OpenType Context Substitution Format 3 and decomposition type handling for Arabic/emoji shaping #824
fjobeir wants to merge 17 commits into opentypejs:master
Conversation
Replace contextSubstitutionFormat3 with implementation that uses lookupCoverageList and getLookupByIndex for proper GSUB lookup type 5 substFormat 3 handling. Some Arabic fonts (e.g. IBM Plex Sans Arabic) use this lookup format for contextual letter forms. https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Fix OpenType Context Substitution Format 3 for Arabic font shaping
Each lookupRecord should only substitute the glyph at its sequenceIndex, not iterate all input lookups. This caused duplicate results (e.g. [54,54,54,54] instead of [54,54]). https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Fix contextSubstitutionFormat3 producing duplicate substitutions
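A minimal sketch of the duplication described in this commit, using a simplified model rather than the actual opentype.js internals. The `lookups` array, the `Map`-based single substitution, and both function names are hypothetical and exist only for illustration. With two lookup records over a two-glyph input, iterating every input glyph per record yields four results, while indexing by `sequenceIndex` yields two.

```javascript
// Hypothetical stand-in for a nested single-substitution lookup:
// glyph ID -> replacement glyph ID.
const singleSubst = new Map([[10, 54], [11, 54]]);
const lookups = [singleSubst];

// Two lookup records, one per input position.
const lookupRecords = [
    { sequenceIndex: 0, lookupListIndex: 0 },
    { sequenceIndex: 1, lookupListIndex: 0 },
];

// Old behavior (simplified): every record walks all input glyphs.
function substituteAllInputs(glyphs, records) {
    const substitutions = [];
    for (const record of records) {
        const lookup = lookups[record.lookupListIndex];
        for (const glyph of glyphs) {
            if (lookup.has(glyph)) substitutions.push(lookup.get(glyph));
        }
    }
    return substitutions;
}

// Fixed behavior (simplified): each record touches only the glyph
// at its own sequenceIndex.
function substituteAtSequenceIndex(glyphs, records) {
    const substitutions = [];
    for (const record of records) {
        const lookup = lookups[record.lookupListIndex];
        const glyph = glyphs[record.sequenceIndex];
        if (lookup.has(glyph)) substitutions.push(lookup.get(glyph));
    }
    return substitutions;
}

const buggy = substituteAllInputs([10, 11], lookupRecords);
const fixed = substituteAtSequenceIndex([10, 11], lookupRecords);
console.log(buggy); // [54, 54, 54, 54]
console.log(fixed); // [54, 54]
```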
The chaining context substitution handler only supported type '12' (single substitution format 2) and threw on all other types. Some fonts (e.g. noto-emoji) use type '21' (multiple/decomposition substitution) within chaining context lookups. https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Handle decomposition substitution type 21 in chainingSubstitutionFormat3
Same fix as contextSubstitutionFormat3: each lookupRecord should only substitute the glyph at its sequenceIndex, not iterate all input lookups. Fixes duplicate glyph output in emoji/flag shaping. https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Fix chainingSubstitutionFormat3 to use lookupRecord.sequenceIndex
Context substitution format 3 only handled nested lookup type '12' (single substitution). Some fonts (e.g. noto-emoji) use type '21' (multiple/decomposition substitution) in context substitution lookup records. Without handling type '21', the second covered glyph was not processed by the context substitution, causing it to be re-matched and duplicated in the ccmp pipeline. https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
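To illustrate the difference between the two nested types, here is a rough sketch of a dispatcher keyed by type string. It assumes the key is the lookup type followed by the subtable format ('12' for single substitution format 2, '21' for multiple substitution format 1). The `handlers` table, the `substituteGlyph` helper, and the subtable shapes (with coverage flattened to a plain array of glyph IDs) are hypothetical and not the opentype.js API.

```javascript
// Hypothetical handlers keyed by lookupType + substFormat.
const handlers = {
    // Single substitution, format 2: one glyph in, one glyph out.
    '12': (glyph, subtable) => {
        const index = subtable.coverage.indexOf(glyph);
        return index === -1 ? null : subtable.substitute[index];
    },
    // Multiple (decomposition) substitution, format 1:
    // one glyph in, a sequence of glyphs out.
    '21': (glyph, subtable) => {
        const index = subtable.coverage.indexOf(glyph);
        return index === -1 ? null : subtable.sequences[index];
    },
};

function substituteGlyph(typeKey, glyph, subtable) {
    const handler = handlers[typeKey];
    if (!handler) {
        throw new Error(`Substitution type ${typeKey} is not supported`);
    }
    return handler(glyph, subtable);
}

// Simplified subtables for illustration only.
const single = { coverage: [10], substitute: [54] };
const multiple = { coverage: [20], sequences: [[30, 31]] };

const singleResult = substituteGlyph('12', 10, single);
const multipleResult = substituteGlyph('21', 20, multiple);
console.log(singleResult);   // 54
console.log(multipleResult); // [30, 31]
```

Note that the type 21 handler returns an array, so a caller collecting results has to decide whether to keep it nested or flatten it.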
Keep @shuding/opentype.js at 1.4.0-beta.0 (the published version). The opentype.js Context Substitution Format 3 fix is tracked separately in opentypejs/opentype.js#824. Restore webkit-text-stroke snapshots to match the published dependency. https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Hi @fdb, would you please take a look whenever possible?
Hi @fjobeir, I looked at this. Some notes:
Ideally we have two things:
// In chainingSubstitutionFormat3, change:
if (substitution) substitutions.push(...substitution);
// to:
if (substitution) substitutions.push(substitution);
The existing chainingSubstitutionFormat3 handles type '71' (extension subtables). These are wrappers that exist purely to allow 32-bit offsets in tables that would otherwise be limited to 16-bit. They don't change the substitution logic; they just redirect to the real subtable data. The existing code handles this by unwrapping the extension subtable.
Sorry for the long review, but GSUB is a tricky beast 😁 If you need any further help, lmk.
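As a rough illustration of the unwrapping idea, the sketch below resolves an extension wrapper to the subtable it points at before computing the type key. The subtable shape (`substFormat`, `lookupType`, `extension`) and the helper names `typeKey` and `unwrapExtension` are assumptions made for this example, not a copy of the existing code.

```javascript
// Type key as lookup type followed by subtable format, e.g. '71' or '12'.
function typeKey(lookupType, subtable) {
    return String(lookupType) + String(subtable.substFormat);
}

// If the subtable is an extension wrapper (type '71'), return the real
// lookup type and subtable it redirects to; otherwise return the input.
function unwrapExtension(lookupType, subtable) {
    if (typeKey(lookupType, subtable) === '71') {
        return { lookupType: subtable.lookupType, subtable: subtable.extension };
    }
    return { lookupType, subtable };
}

// Hypothetical extension wrapper around a single substitution format 2.
const wrapped = {
    substFormat: 1,
    lookupType: 1,
    extension: { substFormat: 2, coverage: [10], substitute: [54] },
};

const resolved = unwrapExtension(7, wrapped);
const resolvedKey = typeKey(resolved.lookupType, resolved.subtable);
console.log(resolvedKey); // '12'
```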
- Remove version field from package.json (handled separately by maintainer)
- Fix type 21 spread inconsistency: use substitutions.push(substitution) instead of push(...substitution) in chainingSubstitutionFormat3, matching contextSubstitutionFormat3 behavior
- Add extension subtable (type 71) unwrapping in contextSubstitutionFormat3, matching existing chainingSubstitutionFormat3 pattern
- Fix contextSubstitutionFormat3 to check each coverage against the glyph at its corresponding position instead of using lookupCoverageList (which only checks contextParams.current against all coverages)
- Add synthetic test font with GSUB type 5 format 3 (context substitution) with coverages.length=2 and distinct sequenceIndex values, verifying:
  - [A, B] → [A', B'] produces correct output (not duplicated)
  - Short context returns empty
  - Non-matching glyphs return empty
  - Single lookupRecord only substitutes at its sequenceIndex

https://claude.ai/code/session_0192MAXgejpjkKBgChuyTFcd
Address PR review feedback
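The per-position coverage check from this commit can be sketched roughly as follows. This is a simplified model, assuming each coverage is a `Set` of glyph IDs; `matchesContext` and the glyph constants are hypothetical names, not the opentype.js implementation. The sample calls mirror the cases the synthetic test font is described as covering.

```javascript
// Returns true only if the glyph at each position is covered by the
// coverage table at the same offset in the context.
function matchesContext(glyphs, start, coverages) {
    // A context shorter than the coverage list can never match.
    if (start + coverages.length > glyphs.length) return false;
    return coverages.every((coverage, offset) =>
        coverage.has(glyphs[start + offset])
    );
}

// Hypothetical glyph IDs for illustration.
const A = 1;
const B = 2;
const C = 3;
const coverages = [new Set([A]), new Set([B])];

console.log(matchesContext([A, B], 0, coverages)); // true
console.log(matchesContext([A], 0, coverages));    // false (short context)
console.log(matchesContext([A, C], 0, coverages)); // false (non-matching glyph)
```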
Removed version '1.4.0-beta.1' from package-lock.json.
Description
Replace the contextSubstitutionFormat3 implementation in src/features/featureQuery.mjs with one that uses lookupCoverageList and the standard lookup infrastructure (getLookupByIndex, getLookupMethod, getSubstitutionType) for proper GSUB lookup type 5, substFormat 3 handling. Additionally, fix both contextSubstitutionFormat3 and chainingSubstitutionFormat3 to use lookupRecord.sequenceIndex instead of iterating all input lookups, and add support for decomposition substitution type 21 in both functions.
Also adds a version field (1.4.0-beta.1) to package.json to support downstream dependency resolution.
Motivation and Context
Some Arabic fonts (e.g. IBM Plex Sans Arabic) use OpenType Context Substitution Format 3 (GSUB lookup type 5, substFormat 3) for contextual letter forms. The previous implementation bypassed the standard lookup infrastructure and could fail to correctly resolve substitutions. Additionally, both contextSubstitutionFormat3 and chainingSubstitutionFormat3 iterated all input lookups for each lookup record instead of targeting the specific glyph at sequenceIndex, causing duplicate glyph output. Fonts like noto-emoji that use nested decomposition substitution (type 21) within context/chaining lookups also failed because only type 12 was handled.
This is part of a broader effort to fix RTL (Arabic/Hebrew) text rendering in Next.js OG image generation, which uses Satori -> opentype.js for font shaping.
How Has This Been Tested?
Screenshots (if appropriate):
N/A — changes affect font shaping logic, validated via automated tests.
Types of changes
Checklist:
Ran npm run test and all tests passed green (including code styling checks).